OVERV[DIS,DBL]1 - www.SailDart.org

perm filename OVERV[DIS,DBL]1 blob sn#209291 filedate 1976-04-05 generic text, type C, neo UTF8

COMMENT ⊗ VALID 00009 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 .NSECP(Overview)
C00003 00003 .SSEC(Three-page Summary of the Project)
C00023 00004 . SSSEC(1-sentence summary of the thesis)
C00024 00005 .SSEC(Motivation: why is this worthwhile?)
C00026 00006 .SSEC(Fifteen-page Summary of the entire project)
C00027 00007 . SSSEC(HISTORIAN'S SUMMARY of AM's researches)
C00033 00008 .SSEC(Guide to reading the remainder of the thesis)
C00034 00009 . SSSEC(VARIED READERSHIP of this thesis)
C00039 ENDMK
C⊗;
.NSECP(Overview)

.SSEC(Three-page Summary of the Project)

Scientists often face the difficult task of formulating research
problems which must be soluble yet nontrivial. In any given branch
of science, it is usually easier to tackle a specific given problem
than to propose interesting yet managable new questions to
investigate. For example, contrast ⊗4solving⊗* the Missionaries and
Cannibals problem with the more ill-defined reasoning which led to
⊗4inventing⊗* it.

This thesis is concerned with creative theory formation in
mathematics: how to propose interesting new concepts and plausible
hypotheses connecting them. The experimental vehicle of my research
is a computer program called ⊗2AM⊗* (for ⊗2↓_A_↓⊗*pprentice
⊗2↓_M_↓⊗*athematician), which carries out some of the activities
involved in mathematical research: noticing simple relationships in
empirical data, formulating new definitions out of existing ones,
proposing some plausible conjectures, and evaluating the aesthetic
"interestingness" of new concepts.

Before discussing how to ⊗4synthesize⊗* a new theory, consider
briefly how to ⊗4analyze⊗* one, how to construct a plausible chain of
reasoning which terminates in a given discovery. One can do this by
working backwards, by reducing the creative act to simpler and
simpler creative acts. For example, consider the concept of prime
numbers. How might one be led to define such a notion? Notice the
following plausible strategy:

.ONCE INDENT 9,9,9 SELECT 6

"If f is a function which transforms elements of A into elements of
B, and B is ordered, then consider just those members of A which are
transformed into ⊗4extremal⊗* elements of B. This set is an
interesting subset of A."

When f(x) means "factors of x", and the ordering is "by length", this
heuristic says to consider those numbers which have a minimal$$ The
other extreme, numbers with a MAXIMAL number of factors, was also
proposed by AM as worth investigating. This led AM to many
interesting questions; the only "new-to-Mankind" mathematical result
so far is that numbers of the following form are all
maximally-divisible: p⊗B1⊗*⊗Aa1⊗* p⊗B2⊗*⊗Aa2⊗* p⊗B3⊗*⊗Aa3⊗*...
p⊗Bk⊗*⊗Aak⊗*, where the p⊗Bi⊗*'s are the first k consecutive primes,
and the exponents a⊗Bi⊗* decrease with i, and the ratio of
(a⊗Bi⊗*+1)/(a⊗Bj⊗*+1) is approximately (as closely as is possibe for
integers) log(p⊗Bj⊗*)/log(p⊗Bi⊗*). For example, a typical
divisor-rich number is
n=2⊗A8⊗*3⊗A5⊗*5⊗A3⊗*7⊗A2⊗*11⊗A2⊗*13⊗A1⊗*17⊗A1⊗*19⊗A1⊗*23⊗A1⊗*29⊗A1⊗*31⊗A1⊗*37⊗A1⊗*41⊗A1⊗*43⊗A1⊗*47⊗A1⊗*53⊗A1⊗*.
The progression of its exponents+1 (9 6 4 3 3 2 2 2 2 2 2 2 2 2 2 2)
is about as close as one can get to satisfying the "logarithm"
constraint. This number n has 3,981,312 divisors. The "AM
Conjecture" is that no number smaller than n has that many divisors.
By the way, this n equals 25,608,675,584. The empirical data
gathered indicated the general character of the declining exponents
to AM, but the precise formulation of the "logs" constraint was
beyond AM, since it doesn't know about logs. In fact, this was the
warped way in which AM first defined a concept like the one we call
"logarithm". The proof of the conjecture involves calculus, and for
that reason was also far beyond the abilities of AM. This example
shows the usefulness of a mechanical "co-researcher", the
effectiveness of AM--human cooperation.$ number of factors -- that
is, the primes. So this rule actually ⊗4reduces⊗* our task from
"proposing the concept of prime numbers" to the more elementary
problems of "discovering cardinality" and "inventing factorization".

But suppose we know this general rule: ⊗6"If f is an interesting
function, consider its inverse."⊗* It reduces the task of discovering
factorization to the simpler task of discovering multiplication.
Eventually, this task reduces to the discovery of very basic notions,
like substitution, set-union, and equality. To explain how a given
researcher might have made a given discovery, such an analysis is
continued until that inductive task is reduced to "discovering"
notions which the researcher already knew.

Suppose a large collection of these heuristics has been assembled
(e.g., by analyzing a great many discoveries, and writing down new
heuristic rules whenever necessary). Instead of using them to
⊗4explain⊗* how a given idea might have evolved, one can imagine
starting from a basic core of knowledge and "running" the heuristics
to ⊗4generate⊗* new concepts.

Such syntheses are precisely what AM does. The program consists of a
corpus of primitive mathematical concepts and a collection of guiding
heuristics (currently a couple hundred of each). AM's activities all
serve to expand AM itself$$ Incidentally, these basic concepts
include the operators which enlarge the space (e.g.,
COMPOSE-2-RELATIONS is both a concept in its own right and a way to
generate new ones). An important kind of heuristic knowledge that AM
possesses is how to create new heuristics to associate with new
concepts it defines. So AM can create new concepts and new
heuristics. $, to enlarge upon a given body of mathematical
knowledge. To cope with the enormity of the potential "search space"
involved, AM uses its heuristics as judgmental criteria to guide
development in the most promising direction. It appears that the
process of inventing worthwhile new$$ Typically, "new" means new to
AM, not to Mankind, and "worthwhile" can only be judged in hindsight.
$ concepts can be guided successfully using a collection of a few
hundred such heuristics.

Each concept is represented as a ⊗6BEING⊗* [Lenat], a frame-like data
structure with 30 different facets or slots. The types of facets
include: ⊗6Examples, Definitions, Generalizations, Domain/Range,
Analogies, Interestingness,⊗* and a couple dozen others. The
⊗6BEINGs⊗* representation provides a convenient scheme for organizing
the heuristics; for example, the following strategy fits into the
⊗4Examples⊗* facet of the ⊗4Predicate⊗* concept: "If, empirically, 10
times as many elements ⊗4fail⊗* some predicate P, as ⊗4satisfy⊗* it,
then some ⊗4generalization⊗* (weakened version) of P might be more
interesting than P". AM considers this suggestion after trying to
fill in examples of any predicate$$ In fact, after AM attempts to
find examples of SET-EQUALITY, so few are found that AM decides to
generalize that predicate. The result is the predicate which means
"Has-the-same-length-as" -- i.e., the rudiments of Cardinality. $.

AM is initially given a large collection of core concepts, with only
a few facets filled in for each. Its sole activity is to choose some
facet of some concept, and fill in that particular slot. In so
doing, new notions will often emerge. Uninteresting ones are
forgotten, mildly interesting ones are kept as parts of one facet of
one concept, and very interesting ones are granted full concept
status. Such new ⊗6Beings⊗* have dozens of blank parts, hence the
space of possible actions (blank slots to fill in) grows rapidly.
The same heuristics are used both to suggest new directions for
investigation, and to limit attention: both to grow and to prune.

The particular mathematical domains in which AM operates depend on
the choice of initial concepts. Currently, AM begins with scanty
knowledge of a couple hundred concepts which Piaget might describe as
⊗4prenumerical⊗*: Sets, substitution, relations, equality, and so on.
In particular, AM is not told anything about proof, single-valued
functions, or numbers. With this basis, AM quickly discovered
elementary numerical concepts (corresponding to those we refer to as
cardinality, multiplication, factors, and primes) and wandered around
in the domain of elementary number theory. Although it was never
able to ⊗4prove⊗* the unique factorization theorem, AM actually did
⊗4conjecture⊗* it$$ Due to the firm base of preliminary concepts
which AM developed, this relationship was almost obvious. AM sought
some predicate P which, for each n, some member of FACTORS-OF(n)
satisfied. ALL-PRIMES was such a predicate. AM next constructed the
relation which associates, to each number n, all factorizations of n
into primes. The full statement of the UFT is simply that this
relation is a function; i.e., it is defined and single-valued for all
numbers n. $. "Discovering" a concept means that (1) AM recognized
it as a distinguished entity (e.g., by formulating its definition)
and also (2) AM decided it was worth investigating (either because of
the interesting way it was formed, or because of surprising
preliminary empirical results).

AM was not able to discover any "new-to-Mankind" mathematics purely
on its own, but ⊗4has⊗* done so when working as a co-researcher with
a human⊗A1⊗*. AM noticed simple new concepts which mathematicians
had overlooked, but AM by itself was not able to precisely formulate
and prove interesting statements about those concepts. I conclude
that a synergetic AM--human combination can sometimes produce better
research than either could alone.

Everything that AM does can be viewed as testing the underlying body
of heuristics. Gradually, this knowledge becomes better organized,
its implications clearer. The resultant body of detailed heuristics
may be the germ of a more efficient programme for educating math
students than the current dogma$$ Currently, the educator takes the
very best work any mathematician has ever done, polishes it until its
brilliance is blinding, and presents it to the student to induce
upon. Many individuals (e.g., Knuth and Polya) have pointed out this
blunder. A few (e.g., Papert at MIT, Adams at Stanford) are
experimenting with more realistic strategies for "teaching"
creativity. $.

Another exciting prospect opened up by AM is that of experimentation:
one can vary the concepts AM starts with, vary the heurisitics
available, etc., and study the effects on AM's behavior. One
experiment involved substituting concepts in an entirely new domain
(plane geometry)$$ A new -- but bizarre -- result was obtained there:
any angle (between 0 and 180↑o) can be approximated to within one
degree, as the sum of two angles drawn from the set α{0↑o, 1↑o, 2↑o,
3↑o, 5↑o, 7↑o, 11↑o,..., 179↑oα}, i.e., the set of angles of prime
size. If our culture and technology were different, this might have
been an important well-known result. $.

One difficulty facing AM is the necessity to judge ⊗4a priori⊗* the
value of each new concept, to quickly lose interest in concepts which
aren't going to develop into anything. Often, such judgments can
only be based on hindsight. For similar reasons, AM has difficulty
formulating new heuristics which are relevant to the new concepts it
creates. Heuristics are often merely compiled hindsight. While AM's
"approach" to empirical research may be used in other scientific
domains, the main limitation (reliance on hindsight) will probably
recur.

. SSSEC(1-sentence summary of the thesis)

"Creative scientific discovery (at least in elementary mathematics)
can be successfully represented as a heuristic search process"

.SSEC(Motivation: why is this worthwhile?)

Inherent interest of getting a handle on the task (sci. creativity)
Personal belief that discovery can be (ought to be) demystified
Potential for learning, from the system, more about the process
of sci. concept formation, thy. formation, chance discovery
(do experiments on the implementations: eg, vary AM's heurs)
Potential usefulness of the implementations themselves (including AM)
Aids to research; i.e., ultimately: new discoveries.
Potential to education: like Mycin, extract heurs. and teach them
All the usual bad reasons:
"Look ma, no hands" + maternal drives + ego + thesis drives +...
Historical:
Need task with no specific goal, to test BEINGs ideas.
Disenchantment with theorem-provers that plod along, in contrast
to the processes which my model of math demands: intu, need,
aesth., multiple reprs, proposing vs proving, fixed task.

.SSEC(Fifteen-page Summary of the entire project)

<<first pass: condense the 20-page text of the 1-hour "whirlwind tour" talk>

<<Or: condense each section of this thesis; pull out all the overviews>

. SSSEC(HISTORIAN'S SUMMARY of AM's researches)

Before diving into the depths of AM, let's take a moment to discuss
the totality of the mathematics which AM carried out. Like a
contemporary historian summarizing the work of Euclid, we shall not
hesitate to use current terms, and criticize by current standards.

AM began its investigations with scanty knowledge of a few
set-theoretic concepts (sets, equaility of sets, set operations).
Most of the obvious set-theory relations (e.g., de Morgan's laws)
were eventually uncovered; since AM never fully understood abstract
algebra, the statement and verification of each of these was quite
obscure. AM never derived a formal notion of infinity, but it
naively established conjectures like "a set can never be a member of
itself", and procedures for making chains of new sets ("insert a set
into itself"). No sophisticated set theory (diagonalization,
ordinals) was ever done.

After this initial period of exploration, AM decided that "equality"
was worth generalizing, and thereby discovered the relation
"same-size-as". Cardinality was based on this, and soon most simple
arithmetic operations were defined. Since addition arose as an analog
to union, and multiplication as an analog to cross-product, it came
as quite a surprise when AM noticed that they were related (namely,
N+N=2xN). Soon after defining multiplication, AM investigated the
process of multiplying a number by itself: squaring. The inverse of
this turned out to be interesting, and led to the definition of
square-root.

Although AM was very close to discovering irrationals at this point,
it turned aside and was content to work with integer-square-root.
Raising to fourth-powers, and fourth-rooting, were discovered at this
time.

.ONCE TURN ON "{}"

The associativity and symmetry of multiplication indicated that it
could accept a BAG (a multiset) of numbers as its argument. This led
to the notion of factoring a number. Minimally-factorable numbers
turned out to be what we call primes. Maximally-factorable numbers
were also thought to be interesting at the time, and in fact an
unusual $$
These are the so-called "highly-composite" numbers of Ramanujan.
As far as the author and his committee know, this is the
first such explicit characterization of these numbers, hence is
probably "new-to-Mankind".
A similar (but slightly different) result has recently been noticed in
[Hardy] (p. 93).
Since the purpose of the thesis is not to
derive "new" mathematics, discussion of this result will be
minimized in this document. A short discussion will be provided in
Section {[2] MAXDIVSEC}.{[1] MAXDIVSSEC}, on page {[3] MAXDIVPAGE}.
$ characterization of such numbers was discovered.

AM conjectured the fundamental theorem of arithmetic (unique
factorization into primes) and Goldbach's conjecture (every even
number >2 is the sum of two primes) in a surprisingly symmetric way.
The unary representation of numbers gave way to a representation as a
bag of primes (based on unique factorization), but AM never thought
of exponential notation. Since the key concepts of modulus and
exponentiation were never discovered, progress in number theory was
arrested.

.SSEC(Guide to reading the remainder of the thesis)

i) Overall organization of the thesis
ii) Plans for what to read (and in what order), depending on your interests
Plan for those interested in the AI ideas
Plan for those interested in the systems ideas
Plan for those interested in mathematics
iii) Pre-requisites and how to satisfy them, for each chapter
For those with little pure mathematics in their background
For those with little computer science background
For computer scientists with little contact to AI before
<either organized by "type" of reader, or by chapter/section>

. SSSEC(VARIED READERSHIP of this thesis)

⊗7¬(A∨C∨M)⊗*

This thesis -- and its readers -- must come to grips with a very
interdisciplinary problem. For the reader whose background is in
Artificial Intelligence, most of the system's actions -- the
"mathematics" it does -- may seem inherently uninteresting. For the
mathematician, the word "LISP" signifies nothing beyond a speech
impediment (to Artificial Intelligence types it also connotes a
programming impediment). If I don't describe "LISP" the first time I
mention it, a large fraction of potential readers will never realize
that potential. If I ⊗4do⊗* stop to describe LISP, the other readers
will be bored. The standard solutions are to sacrifice one fixed
community in favor of the other, or to be entertaining enough to keep
both of them around. In this work, a third alternative will be taken.
Sections will be tagged with descriptors like "⊗7M⊗*" (indicating
that the section will be of interest to those who enjoy mathematics),
or "⊗7¬A⊗*" (the section will be a waste of time for those familiar
with Artificial Intelligence). The labels will consist of the letters
⊗7A⊗* (for ↓_A_↓I), ⊗7M⊗* (for ↓_M_↓ath), and ⊗7C⊗* (for general
↓_C_↓omputer science), connected by standard logical symbols: ⊗7¬⊗*
(negation), ⊗7∧⊗* (and), ⊗7∨⊗* (or). For example, since the current
paragraph is labelled ⊗7¬(A∨C∨M)⊗*, the reader can assume it is of
little interest to anyone.

In addition, there are two glossaries of terms. Appendix 1a contains
capsule descriptions of about 100 mathemtical terms, ideas, notations,
and jokes. Appendix 1b renders the analogous service for Artificial
Intelligence jargon and computer science concepts.